59 research outputs found

    Statistical Mechanics of Community Detection

    Starting from a general ansatz, we show how community detection can be interpreted as finding the ground state of an infinite-range spin glass. Our approach applies to weighted and directed networks alike. It contains the ad hoc quality function introduced in \cite{ReichardtPRL} and the modularity Q as defined by Newman and Girvan \cite{Girvan03} as special cases. The community structure of the network is interpreted as the spin configuration that minimizes the energy of the spin glass, with the spin states being the community indices. We elucidate the properties of the ground state configuration to give a concise definition of communities as cohesive subgroups in networks that is adaptive to the specific class of network under study. Further, we show how hierarchies and overlap in the community structure can be detected. Computationally effective local update rules for optimization procedures to find the ground state are given. We show how the ansatz may be used to discover the community around a given node without detecting all communities in the full network, and we give benchmarks for the performance of this extension. Finally, we give expectation values for the modularity of random graphs, which can be used in the assessment of statistical significance of community structure.
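
    As a rough illustration of the modularity Q that the abstract names as a special case, the sketch below evaluates the standard Newman-Girvan form, Q = sum over communities c of [e_c/m - (d_c/2m)^2], where m is the number of edges, e_c the number of edges inside community c, and d_c the total degree of its nodes. It assumes an undirected, unweighted graph without self-loops; the function name and the edge-list input format are illustrative choices, not taken from the paper, and the paper's more general spin glass energy is not reproduced here.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a node partition.

    Assumptions (illustrative, not from the paper): the graph is undirected
    and unweighted, each edge appears once as a (u, v) pair, and there are
    no self-loops. `community` maps every node to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    degree = defaultdict(int)      # degree of each node
    internal = defaultdict(int)    # edges with both ends in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    degree_sum = defaultdict(int)  # total degree per community
    for node, k in degree.items():
        degree_sum[community[node]] += k

    # Q = sum_c [ e_c / m - (d_c / 2m)^2 ]
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
TWO_GROUPS = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
ONE_GROUP = {node: "all" for node in range(6)}

if __name__ == "__main__":
    print(modularity(EDGES, TWO_GROUPS))
    print(modularity(EDGES, ONE_GROUP))
```

    Splitting the toy graph into its two triangles gives a positive Q, while placing every node in one community gives Q = 0, which is why a comparison against random-graph expectation values, as described in the abstract, is needed to judge significance.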

    Partitioning and modularity of graphs with arbitrary degree distribution

    We solve the graph bi-partitioning problem in dense graphs with arbitrary degree distribution using the replica method. We find the cut-size to scale universally with . In contrast, earlier results studying the problem in graphs with a Poissonian degree distribution had found a scaling with ^1/2 [Fu and Anderson, J. Phys. A: Math. Gen. 19, 1986]. The new results also generalize to the problem of q-partitioning. They can be used to find the expected modularity Q [Newman and Girvan, Phys. Rev. E, 69, 2004] of random graphs and allow for the assessment of statistical significance of the output of community detection algorithms.
    Comment: Revised version including new plots and improved discussion of some mathematical detail

    Increase in consumption of alcohol-based hand rub in German acute care hospitals over a 12 year period

    Background: Hand hygiene plays a crucial role in the transmission of pathogens and the prevention of healthcare-associated infections. In 2007, a voluntary national electronic surveillance tool for the documentation of consumption of alcohol-based hand rub (AHC) was introduced as a surrogate for hand hygiene compliance (HAND-KISS) and for the provision of benchmark data as feedback. The aim of the study was to determine the trend in alcohol-based hand rub consumption between 2007 and 2018. Materials and methods: In this cohort study, AHC and patient days (PD) were documented on every ward in participating hospitals by trained local staff. Data were collected and validated in HAND-KISS. Intensive care units (ICU), intermediate care units (IMC), and regular wards (RW) that provided data during the study period from 2007 to 2018 were included in the study. Results: In 2018, 75.2% of acute care hospitals in Germany (n=1,460) participated. On ICUs (n=1998), mean AHC increased 1.74-fold (95% CI 1.71, 1.76; p<.0001) from 79.2 ml/PD to 137.4 ml/PD. On IMCs (n=475), AHC increased 1.69-fold (95% CI 1.60, 1.79; p<.0001) from 41.4 ml/PD to 70.6 ml/PD. On RWs (n=14,857), AHC was 19.0 ml/PD in 2007 and increased 1.71-fold (95% CI 1.70, 1.73; p<.0001) to 32.6 ml/PD in 2018. Conclusions: AHC in German hospitals increased on all types of wards during the past 12 years. Surveillance of AHC is widely established in German hospitals. Large differences among medical specialties exist and warrant further investigation.

    eBay users form stable groups of common interest

    Market segmentation of an online auction site is studied by analyzing the users' bidding behavior. The distribution of user activity is investigated and a network of bidders connected by common interest in individual articles is constructed. The network's cluster structure corresponds to the main user groups according to common interest, exhibiting hierarchy and overlap. A key feature of the analysis is its independence of any similarity measure between the articles offered on eBay, as such a measure would only introduce bias in the analysis. Results are compared to null models based on random networks, and clusters are validated and interpreted using the taxonomic classifications of eBay categories. We find clear-cut and coherent interest profiles for the bidders in each cluster. The interest profiles of bidder groups are compared to the classification of articles actually bought by these users during the time span 6-9 months after the initial grouping. The interest profiles discovered remain stable, indicating typical interest profiles in society. Our results show how network theory can be applied successfully to problems of market segmentation and sociological milieu studies with sparse, high-dimensional data.
    Comment: Major revision of the manuscript. Methodological improvements and inclusion of analysis of temporal development of user interests. 19 pages, 12 figures, 5 tables

    A Statistical Performance Analysis of Graph Clustering Algorithms

    Measuring graph clustering quality remains an open problem. Here, we introduce three statistical measures to address the problem. We empirically explore their behavior under a number of stress test scenarios and compare it to that of the commonly used modularity and conductance. Our measures are robust, immune to the resolution limit, easy to interpret intuitively, and also have a formal statistical interpretation. Our empirical stress test results confirm that our measures compare favorably to the established ones. In particular, they are shown to be more responsive to graph structure, less sensitive to sample size and breakdowns during numerical implementation, and less sensitive to uncertainty in connectivity. These features are especially important in the context of larger data sets or when the data may contain errors in the connectivity patterns.
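
    For orientation, conductance, one of the two established baselines named in the abstract, can be sketched as follows. The code uses the common single-cluster definition, cut(S) / min(vol(S), vol(V \ S)), where cut(S) counts edges leaving the cluster S and vol is the sum of node degrees. This is an assumption about the definition; the paper may aggregate conductance over a whole clustering differently, and the function name and input format are illustrative only.

```python
def conductance(edges, cluster):
    """Conductance of one cluster in an undirected, unweighted graph.

    Assumptions (illustrative, not from the paper): each edge appears once
    as a (u, v) pair and there are no self-loops. Lower values indicate a
    cluster that is better separated from the rest of the graph.
    """
    cluster = set(cluster)
    cut = 0        # edges with exactly one endpoint inside the cluster
    vol_in = 0     # sum of degrees of nodes inside the cluster
    vol_out = 0    # sum of degrees of nodes outside the cluster
    for u, v in edges:
        u_in, v_in = u in cluster, v in cluster
        if u_in != v_in:
            cut += 1
        vol_in += int(u_in) + int(v_in)
        vol_out += int(not u_in) + int(not v_in)

    smaller_volume = min(vol_in, vol_out)
    if smaller_volume == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller_volume


# Two triangles joined by a single bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

if __name__ == "__main__":
    print(conductance(EDGES, {0, 1, 2}))  # one triangle
    print(conductance(EDGES, {0}))        # a single node
```

    On this toy graph a whole triangle scores far lower (better) than a single node cut out of it, which matches the intuition that conductance rewards clusters with few outgoing edges relative to their volume.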

    Information Symmetries in Irreversible Processes

    We study dynamical reversibility in stationary stochastic processes from an information theoretic perspective. Extending earlier work on the reversibility of Markov chains, we focus on finitary processes with arbitrarily long conditional correlations. In particular, we examine stationary processes represented or generated by edge-emitting, finite-state hidden Markov models. Surprisingly, we find pervasive temporal asymmetries in the statistics of such stationary processes with the consequence that the computational resources necessary to generate a process in the forward and reverse temporal directions are generally not the same. In fact, an exhaustive survey indicates that most stationary processes are irreversible. We study the ensuing relations between model topology in different representations, the process's statistical properties, and its reversibility in detail. A process's temporal asymmetry is efficiently captured using two canonical unifilar representations of the generating model, the forward-time and reverse-time epsilon-machines. We analyze example irreversible processes whose epsilon-machine presentations change size under time reversal, including one which has a finite number of recurrent causal states in one direction, but an infinite number in the opposite. From the forward-time and reverse-time epsilon-machines, we are able to construct a symmetrized, but nonunifilar, generator of a process: the bidirectional machine. Using the bidirectional machine, we show how to directly calculate a process's fundamental information properties, many of which are otherwise only poorly approximated via process samples. The tools we introduce and the insights we offer provide a better understanding of the many facets of reversibility and irreversibility in stochastic processes.
    Comment: 32 pages, 17 figures, 2 tables; http://csc.ucdavis.edu/~cmg/compmech/pubs/pratisp2.ht

    Comparative Study for Inference of Hidden Classes in Stochastic Block Models

    Inference of hidden classes in the stochastic block model is a classical problem with important applications. The most commonly used methods for this problem involve naïve mean field approaches or heuristic spectral methods. Recently, belief propagation was proposed for this problem. In this contribution, we perform a comparative study of the three methods on synthetically created networks. We show that belief propagation performs much better than the naïve mean field and spectral approaches. This applies to accuracy, computational efficiency, and the tendency to overfit the data.
    Comment: 8 pages, 5 figures AIGM1

    The interplay of microscopic and mesoscopic structure in complex networks

    Not all nodes in a network are created equal. Differences and similarities exist at both individual node and group levels. Disentangling single node from group properties is crucial for network modeling and structural inference. Based on unbiased generative probabilistic exponential random graph models and employing distributive message passing techniques, we present an efficient algorithm that allows one to separate the contributions of individual nodes and groups of nodes to the network structure. This leads to improved detection accuracy of latent class structure in real world data sets compared to models that focus on group structure alone. Furthermore, the inclusion of hitherto neglected group specific effects in models used to assess the statistical significance of small subgraph (motif) distributions in networks may be sufficient to explain most of the observed statistics. We show the predictive power of such generative models in forecasting putative gene-disease associations in the Online Mendelian Inheritance in Man (OMIM) database. The approach is suitable for both directed and undirected unipartite networks as well as for bipartite networks.

    Orientation bias of optically selected galaxy clusters and its impact on stacked weak-lensing analyses

    Weak-lensing measurements of the averaged shear profiles of galaxy clusters binned by some proxy for cluster mass are commonly converted to cluster mass estimates under the assumption that these cluster stacks have spherical symmetry. In this paper, we test whether this assumption holds for optically selected clusters binned by estimated optical richness. Using mock catalogues created from N-body simulations populated realistically with galaxies, we ran a suite of optical cluster finders and estimated their optical richness. We binned galaxy clusters by true cluster mass and estimated optical richness and measured the ellipticity of these stacks. We find that the processes of optical cluster selection and richness estimation are biased, leading to stacked structures that are elongated along the line of sight. We show that weak-lensing alone cannot measure the size of this orientation bias. Weak-lensing masses of stacked optically selected clusters are overestimated by up to 3–6 per cent when clusters can be uniquely associated with haloes. This effect is large enough to lead to significant biases in the cosmological parameters derived from large surveys like the Dark Energy Survey, if not calibrated via simulations or fitted simultaneously. This bias probably also contributes to the observed discrepancy between the observed and predicted Sunyaev–Zel’dovich signal of optically selected clusters.